Feature Engineering and a Proposed Decision-Support System for Systematic Reviewers of Medical Evidence
Authors
Abstract
OBJECTIVES Evidence-based medicine depends on the timely synthesis of research findings. Systematic reviews are an important source of synthesized evidence. However, a bottleneck in review production is the dual screening of citations, by title and abstract, to find eligible studies. For this research, we tested the effect of various kinds of textual information (features) on the performance of a machine learning classifier. Based on our findings, we propose an automated system to reduce screening burden as well as offer quality assurance.
METHODS We built a database of citations from 5 systematic reviews that varied with respect to domain, topic, and sponsor. Consensus judgments regarding eligibility were inferred from published reports. We extracted 5 feature sets from citations: alphabetic, alphanumeric(+), indexing, features mapped to concepts in systematic reviews, and topic models. To simulate a two-person team, we divided the data into random halves. We optimized the parameters of a Bayesian classifier, then trained and tested models on alternate data halves. Overall, we conducted 50 independent tests.
RESULTS All tests of summary performance (mean F3) surpassed the corresponding baseline, P<0.0001. The ranks for mean F3, precision, and classification error were statistically different across feature sets averaged over reviews; P-values for Friedman's test were .045, .002, and .002, respectively. Differences in ranks for mean recall were not statistically significant. Alphanumeric(+) features were associated with the best performance; mean reduction in screening burden for this feature type ranged from 88% to 98% for the second pass through citations and from 38% to 48% overall.
CONCLUSIONS A computer-assisted decision support system based on our methods could substantially reduce the burden of screening citations for systematic review teams and solo reviewers. Additionally, such a system could deliver quality assurance both by confirming concordant decisions and by naming studies associated with discordant decisions for further consideration.
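The abstract reports mean F3 and a random split of the data into halves without giving formulas. As a minimal sketch, assuming the standard F-beta definition with beta = 3 (which weights recall above precision) and a simple shuffled split, the two ideas can be illustrated as follows. The function names, the seed, and the example counts are hypothetical and are not taken from the paper.

```python
import random


def f_beta(tp, fp, fn, beta=3.0):
    """Standard F-beta from confusion counts; beta > 1 favors recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def random_halves(items, seed=0):
    """Shuffle a copy of items and split it into two near-equal halves."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Hypothetical counts: 9 eligible citations found, 30 false alarms, 1 missed.
score_f3 = f_beta(tp=9, fp=30, fn=1)
score_f1 = f_beta(tp=9, fp=30, fn=1, beta=1.0)

# Hypothetical citation IDs; one half would train a model that is tested
# on the other half, and then the roles are swapped.
half_a, half_b = random_halves(range(10))
```

With these illustrative counts, F3 is well above F1 because recall is high even though precision is low, which matches a screening setting where missing an eligible study costs more than reading an extra citation.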
Similar resources
Anomaly Detection Using SVM as Classifier and Decision Tree for Optimizing Feature Vectors
Abstract: With the advancement of computer network technologies, intrusion has become easier; as a result, intrusion detection systems (IDS) are increasingly important as a key element of security for detecting threats and attacks. One of the challenges for intrusion detection systems is managing the large number of network traffic features. Removing un...
A Decision Support System for Diagnosis of Diabetes and Hepatitis, based on the Combination of Particle Swarm Optimization and Firefly Algorithm
Introduction: Clinical Decision Support Systems (CDSS) are computer programs that help medical professionals make decisions about disease diagnosis. The main aim of these systems is to assist physicians in diagnosing diseases; in other words, a physician can interact with the system and use it to analyze patient data, diagnose diseases, and carry out other medical activities. Me...
System Factors Influencing the Australian Nurses' Evidence-based Clinical Decision Making: A Systematic Review of Recent Studies
Background: There is growing attention to evidence-based practice in Australian clinical contexts and nursing literature. Recent research explores the dimensions of evidence-based practice; however, the implementation of evidence-based clinical decision making has been identified as a cumbersome process. Aim: This study aimed to review the literature syst...
A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)
Machine learning-based classification techniques support the decision-making process in healthcare, especially in disease diagnosis, prognosis, and screening. Healthcare datasets are voluminous, and their high dimensionality leads to a slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...